Fuzzy clustering of time series gene expression data with cubic-spline
Authors
Abstract
Data clustering techniques have been applied to extract information from gene expression data for two decades. A large number of novel clustering algorithms have been developed and have achieved great success. However, because of the diverse structures and heavy noise in such data, no single clustering approach can be reliably applied to all gene expression data. In this paper, we target the high noise level: we fit a cubic smoothing spline to each time-course expression profile to filter out noise, and then group genes into clusters by applying fuzzy c-means clustering to the resulting splines (FCMS). Discrete values of the radius of curvature are used to compute the similarity between spline curves. Results on gene expression data show that FCMS is more reliable and more robust to noise than the original fuzzy c-means.
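The pipeline described in the abstract (smooth each profile, sample a curvature-based feature, then cluster with fuzzy c-means) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `curvature_features` and `fuzzy_c_means`, the synthetic data, the smoothing factor, and the fuzzifier `m = 2` are all hypothetical choices. The sketch also samples signed curvature (the reciprocal of the radius of curvature) to avoid division by zero at inflection points, and uses Euclidean distance between the sampled vectors; the paper's exact similarity measure may differ.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline


def curvature_features(t, y, grid, smooth):
    """Fit a cubic smoothing spline to one profile and sample its curvature."""
    spline = UnivariateSpline(t, y, k=3, s=smooth)  # k=3 gives a cubic spline
    d1 = spline(grid, nu=1)  # first derivative on the grid
    d2 = spline(grid, nu=2)  # second derivative on the grid
    # Assumption: signed curvature = 1 / (radius of curvature), which stays
    # finite where the second derivative vanishes.
    return d2 / (1.0 + d1 ** 2) ** 1.5


def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means; returns memberships (n x c) and cluster centers."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)  # each row of memberships sums to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # guard against zero distances
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers


# Synthetic stand-in for expression profiles: two groups of noisy curves.
rng = np.random.default_rng(42)
t = np.linspace(0.0, 2.0 * np.pi, 15)
# Sample away from the end points, where derivative estimates are least stable.
grid = np.linspace(t[1], t[-2], 20)
noise_sd = 0.05
profiles = np.vstack(
    [np.sin(t) + rng.normal(0.0, noise_sd, t.size) for _ in range(10)]
    + [-np.sin(t) + rng.normal(0.0, noise_sd, t.size) for _ in range(10)]
)

smooth = t.size * noise_sd ** 2  # heuristic smoothing factor (assumption)
X = np.array([curvature_features(t, y, grid, smooth) for y in profiles])
U, centers = fuzzy_c_means(X, c=2)
labels = U.argmax(axis=1)  # hard labels derived from the fuzzy memberships
print(labels)
```

Clustering on features derived from the fitted spline, rather than on the raw measurements, is what lets the smoothing step act as a noise filter before the fuzzy memberships are computed.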
Similar resources
Microarray Time-Series Data Clustering via Multiple Alignment of Gene Expression Profiles
Genes with similar expression profiles are expected to be functionally related or co-regulated. In this direction, clustering microarray time-series data via pairwise alignment of piece-wise linear profiles has been recently introduced. We propose a k-means clustering approach based on a multiple alignment of natural cubic spline representations of gene expression profiles. The multiple alignme...
Continuous Representations of Time-Series Gene Expression Data
We present algorithms for time-series gene expression analysis that permit the principled estimation of unobserved time points, clustering, and dataset alignment. Each expression profile is modeled as a cubic spline (piecewise polynomial) that is estimated from the observed data and every time point influences the overall smooth expression curve. We constrain the spline coefficients of genes in...
Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis
Recognizing genes with distinctive expression levels can help in prevention, diagnosis and treatment of the diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering the gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...
Fuzzy Clustering Models for Gene Expression Data Analysis
copies of full items can be reproduced, displayed or performed, and given to third parties in any format or medium for personal research or study, educational, or not-for-profit purposes without prior permission or charge, provided the authors, title and full bibliographic details are given, as well as a hyperlink and/or URL to the original metadata page. The content must not be changed in a...
Piecewise cubic interpolation of fuzzy data based on B-spline basis functions
In this paper, a fuzzy piecewise cubic interpolation is constructed for fuzzy data based on B-spline basis functions. We add two new conditions which guarantee uniqueness of the fuzzy B-spline interpolation. Other conditions are imposed on the interpolation data to guarantee that the interpolation function is a well-defined fuzzy function. Finally some examples are given to illustrate the...